support TORCH_CUDA_ARCH_LIST and avoid link against libcuda.so at compile time #245
Conversation
@winggan Thanks man, it worked!! Can you please keep your branch updated for anyone who wants to try this PR:
It has been a while since I created this PR, which covered most of the compile-time issues at that time. Could you briefly explain what has been updated since then and how I can help? I haven't been keeping up with the project for a while due to work commitments.
support TORCH_CUDA_ARCH_LIST so we can compile the wheel in a non-GPU environment, making it more convenient to work with CI/CD and prebuilt binary distribution
Now we can build SageAttention in a CPU-only environment, for example:

```shell
CUDA_HOME=/path/to/cuda-x.y TORCH_CUDA_ARCH_LIST='8.0;9.0+PTX' python3 setup.py bdist_wheel
```

`libcuda.so` (the driver API) is not available in a non-GPU environment, so we should avoid linking against it directly at compile time.
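For illustration, here is a minimal sketch of how a setup.py can turn a `TORCH_CUDA_ARCH_LIST`-style string into nvcc `-gencode` flags without probing a physical GPU. The helper name `arch_list_to_gencode` is hypothetical, not the actual code in this PR; PyTorch's `torch.utils.cpp_extension` performs a similar translation internally.

```python
import os

def arch_list_to_gencode(arch_list: str) -> list[str]:
    """Translate a TORCH_CUDA_ARCH_LIST string (e.g. '8.0;9.0+PTX')
    into nvcc -gencode flags, with no GPU query needed."""
    flags = []
    for arch in arch_list.replace(" ", ";").split(";"):
        if not arch:
            continue
        # A '+PTX' suffix asks for PTX to be embedded as well, for JIT
        # compilation on future architectures.
        ptx = arch.endswith("+PTX")
        num = arch.removesuffix("+PTX").replace(".", "")  # '8.0' -> '80'
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if ptx:
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags

# Honor the environment variable when set, falling back to a default list.
arch_env = os.environ.get("TORCH_CUDA_ARCH_LIST", "8.0;9.0+PTX")
print(arch_list_to_gencode(arch_env))
```

With `TORCH_CUDA_ARCH_LIST='8.0;9.0+PTX'` this emits SASS for sm_80 and sm_90 plus PTX for compute_90, which is what makes a build on a GPU-less CI machine possible.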
NVIDIA offers a standard way to access the driver API via `cudaGetDriverEntryPointByVersion` (or, in earlier CUDA versions, `cudaGetDriverEntryPoint`), which dynamically resolves and calls driver API symbols at runtime.
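The same idea — resolving `libcuda.so` at runtime instead of linking against it at build time — can be illustrated in Python with `ctypes`. This is only an analogy for the C++ mechanism the PR uses via `cudaGetDriverEntryPoint`; the point is that a missing driver becomes a recoverable runtime condition rather than a link-time failure.

```python
import ctypes

def try_load_driver():
    """Attempt to dlopen libcuda.so at runtime. Returns the library
    handle, or None on a machine without the NVIDIA driver -- instead
    of failing at link time the way a compile-time -lcuda would."""
    for name in ("libcuda.so.1", "libcuda.so"):
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue
    return None

driver = try_load_driver()
if driver is None:
    print("no libcuda.so: driver API unavailable, but the program still runs")
else:
    # cuDriverGetVersion is a real driver-API entry point.
    version = ctypes.c_int(0)
    driver.cuDriverGetVersion(ctypes.byref(version))
    print("driver API version:", version.value)
```

On a CPU-only CI machine this prints the fallback message and exits cleanly, which is exactly the behavior a prebuilt wheel needs.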